Local Term Weight Models from Power Transformations: Development of BM25IR: A Best Match Model based on Inverse Regression
Abstract
In this article we show how power transformations can be used as a common framework for deriving local term weights. We find that, under some parametric conditions, BM25 and inverse regression produce equivalent results. We also show that, in a special case of inverse regression, the largest increment in term weight occurs when a term is mentioned for the second time. A model based on inverse regression (BM25IR) is presented. Simulations suggest that BM25IR works fairly well across different BM25 parametric conditions and document lengths.
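As a rough illustration of why BM25 and a reciprocal (inverse) model can coincide, the sketch below uses the widely cited BM25 local weight, tf(k1 + 1) / (tf + K) with K = k1(1 - b + b * dl / avdl), and rewrites it as a straight line in the reciprocals: 1/w = 1/(k1 + 1) + (K/(k1 + 1)) * (1/tf). This is plain algebra on the standard formula, not the article's BM25IR model, whose exact form is not given in the abstract. The function names, the defaults k1 = 1.2 and b = 0.75, and the document lengths are assumptions chosen for illustration.

```python
def bm25_local_weight(tf, k1=1.2, b=0.75, doc_len=100.0, avg_doc_len=100.0):
    """Standard BM25 local (within-document) term weight.

    Parameter defaults are common illustrative choices, not values
    taken from the article.
    """
    # Length-normalised saturation constant.
    K = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + K)


def weight_from_reciprocal_line(tf, k1=1.2, b=0.75, doc_len=100.0, avg_doc_len=100.0):
    """Same weight, obtained from a line in reciprocal space.

    1/w = intercept + slope * (1/tf). This reciprocal form is an
    assumption about what "inverse regression" could look like here;
    it is not claimed to be the article's BM25IR.
    """
    K = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    intercept = 1.0 / (k1 + 1.0)
    slope = K / (k1 + 1.0)
    return 1.0 / (intercept + slope / tf)


if __name__ == "__main__":
    # Compare the two forms for the first few term frequencies.
    for tf in range(1, 6):
        direct = bm25_local_weight(tf)
        via_line = weight_from_reciprocal_line(tf)
        print(f"tf={tf}  bm25={direct:.6f}  reciprocal-line={via_line:.6f}")
```

Under these assumptions the two functions agree for every positive term frequency, so fitting a line to (1/tf, 1/w) pairs would recover BM25-style weights when the fitted intercept and slope match the values above.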
Journal: CoRR
Volume: abs/1608.01573
Publication date: 2016